Overview

Dataset statistics

Number of variables: 13
Number of observations: 5477006
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1410
Duplicate rows (%): < 0.1%
Total size in memory: 1.1 GiB
Average record size in memory: 220.0 B

Variable types

Numeric: 10
Categorical: 3

Warnings

Dataset has 1410 (< 0.1%) duplicate rows [Duplicates]
date has a high cardinality: 1075 distinct values [High cardinality]
time has a high cardinality: 86400 distinct values [High cardinality]
geo_lon is highly correlated with region [High correlation]
region is highly correlated with geo_lon [High correlation]
level is highly correlated with levels [High correlation]
levels is highly correlated with level [High correlation]
rooms is highly correlated with area [High correlation]
area is highly correlated with rooms [High correlation]
price is highly correlated with area [High correlation]
area is highly correlated with price and 2 other fields [High correlation]
kitchen_area is highly correlated with area [High correlation]
geo_lat is highly correlated with geo_lon [High correlation]
geo_lon is highly correlated with geo_lat and 1 other field [High correlation]
levels is highly correlated with level and 1 other field [High correlation]
building_type is highly correlated with levels [High correlation]
area is highly skewed (γ1 = 57.05613875) [Skewed]
kitchen_area is highly skewed (γ1 = 452.5307552) [Skewed]
building_type has 307165 (5.6%) zeros [Zeros]
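
The γ1 values in the skew warnings are skewness coefficients. As a rough illustration, the sketch below computes the population moment coefficient of skewness, γ1 = m3 / m2^(3/2), in plain Python. It is an assumption that this matches the report's estimator; pandas applies a small-sample bias correction, which should matter little at roughly 5.5 million rows. The function name `skewness` and both sample lists are hypothetical.

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical samples: one symmetric, one with a single extreme value
# similar in spirit to the area maximum of 7856.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [30.0, 35.0, 40.0, 45.0, 7856.0]

print(skewness(symmetric))
print(skewness(right_tailed))
```

A symmetric sample gives a value of zero, while a single extreme value on the right pulls the coefficient strongly positive, which is the pattern the area and kitchen_area warnings point to.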

Reproduction

Analysis started: 2021-08-07 18:20:20.529571
Analysis finished: 2021-08-07 18:25:56.802176
Duration: 5 minutes and 36.27 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

price
Real number (ℝ)

HIGH CORRELATION

Distinct: 352726
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4422029.023
Minimum: -2144967296
Maximum: 2147483647
Zeros: 23
Zeros (%): < 0.1%
Negative: 365
Negative (%): < 0.1%
Memory size: 41.8 MiB

Quantile statistics

Minimum: -2144967296
5th percentile: 1150000
Q1: 1950000
Median: 2990000
Q3: 4802000
95th percentile: 11395000
Maximum: 2147483647
Range: 4292450943
Interquartile range (IQR): 2852000

Descriptive statistics

Standard deviation: 21507519.15
Coefficient of variation (CV): 4.863721844
Kurtosis: 6278.420808
Mean: 4422029.023
Median Absolute Deviation (MAD): 1242200
Skewness: -14.22845762
Sum: 2.421947949 × 10^13
Variance: 4.625733802 × 10^14
Monotonicity: Not monotonic
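
Several of the price figures above are derived from others in the same tables, which allows a quick consistency check. The sketch below recomputes the range, IQR, coefficient of variation and variance from the reported values; the variable names are illustrative only.

```python
# Values copied from the price statistics in this report.
minimum, maximum = -2144967296, 2147483647
q1, q3 = 1950000, 4802000
mean, std = 4422029.023, 21507519.15

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```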
[Figure: histogram with fixed-size bins (bins=50)]

Value | Count | Frequency (%)
2500000 | 65745 | 1.2%
2300000 | 59842 | 1.1%
3500000 | 57676 | 1.1%
2200000 | 57505 | 1.0%
2100000 | 56654 | 1.0%
1650000 | 54242 | 1.0%
1800000 | 53363 | 1.0%
2600000 | 53131 | 1.0%
2400000 | 51834 | 0.9%
1750000 | 51529 | 0.9%
Other values (352716) | 4915485 | 89.7%

Value | Count | Frequency (%)
-2144967296 | 2 | < 0.1%
-2114967296 | 2 | < 0.1%
-2114150296 | 159 | < 0.1%
-2094967296 | 4 | < 0.1%
-2089967296 | 1 | < 0.1%
-2053850296 | 13 | < 0.1%
-2041757296 | 4 | < 0.1%
-2040742296 | 8 | < 0.1%
-1964967296 | 1 | < 0.1%
-1944967296 | 6 | < 0.1%

Value | Count | Frequency (%)
2147483647 | 2 | < 0.1%
2089477704 | 2 | < 0.1%
2083290000 | 42 | < 0.1%
2050000000 | 4 | < 0.1%
2000000300 | 1 | < 0.1%
2000000000 | 1 | < 0.1%
1990000000 | 1 | < 0.1%
1945382704 | 1 | < 0.1%
1922580000 | 4 | < 0.1%
1904032704 | 1 | < 0.1%
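
The price maximum equals 2^31 − 1, the largest signed 32-bit integer, and the most negative prices are consistent with large values that wrapped around a 32-bit field: adding 2^32 to −2144967296 gives exactly 2150000000. This is one plausible reading of the extreme values, not something the report states. A minimal sketch of the check, using a hypothetical helper `unwrap_int32`:

```python
INT32_MAX = 2**31 - 1
INT32_MODULUS = 2**32


def unwrap_int32(value):
    """Undo a signed 32-bit wraparound, assuming the true value was non-negative."""
    return value + INT32_MODULUS if value < 0 else value


# Three of the most negative prices listed in the report.
for observed in (-2144967296, -2114967296, -2094967296):
    print(observed, "->", unwrap_int32(observed))

# The reported maximum sits exactly on the int32 boundary.
print(2147483647 == INT32_MAX)
```

Whether such rows should be repaired or dropped depends on the source system; the sketch only shows that the negative prices line up with round values after unwrapping.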

date
Categorical

HIGH CARDINALITY

Distinct: 1075
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 350.0 MiB

2020-03-27: 44764
2019-02-28: 22362
2020-02-01: 21815
2018-09-18: 21696
2018-12-31: 21235
Other values (1070): 5345134

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 54770060
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 53
Unique (%): < 0.1%

Sample

1st row: 2018-02-19
2nd row: 2018-02-27
3rd row: 2018-02-28
4th row: 2018-03-01
5th row: 2018-03-01

Common Values

Value | Count | Frequency (%)
2020-03-27 | 44764 | 0.8%
2019-02-28 | 22362 | 0.4%
2020-02-01 | 21815 | 0.4%
2018-09-18 | 21696 | 0.4%
2018-12-31 | 21235 | 0.4%
2020-09-01 | 21132 | 0.4%
2021-04-27 | 20179 | 0.4%
2021-04-30 | 20147 | 0.4%
2019-04-01 | 19759 | 0.4%
2020-10-01 | 18686 | 0.3%
Other values (1065) | 5245231 | 95.8%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
0 | 13884262 | 25.4%
2 | 11188298 | 20.4%
- | 10954012 | 20.0%
1 | 8638985 | 15.8%
9 | 3308723 | 6.0%
8 | 1672331 | 3.1%
3 | 1459531 | 2.7%
4 | 1043895 | 1.9%
7 | 947096 | 1.7%
6 | 858651 | 1.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 43816048 | 80.0%
Dash Punctuation | 10954012 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 13884262 | 31.7%
2 | 11188298 | 25.5%
1 | 8638985 | 19.7%
9 | 3308723 | 7.6%
8 | 1672331 | 3.8%
3 | 1459531 | 3.3%
4 | 1043895 | 2.4%
7 | 947096 | 2.2%
6 | 858651 | 2.0%
5 | 814276 | 1.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 10954012 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54770060 | 100.0%


Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54770060 | 100.0%


time
Categorical

HIGH CARDINALITY

Distinct: 86400
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 339.5 MiB

16:15:49: 197
06:32:22: 190
06:36:17: 189
06:32:06: 186
06:32:15: 184
Other values (86395): 5476060

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 43816048
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20:00:21
2nd row: 12:04:54
3rd row: 15:44:00
4th row: 11:24:52
5th row: 17:42:43

Common Values

Value | Count | Frequency (%)
16:15:49 | 197 | < 0.1%
06:32:22 | 190 | < 0.1%
06:36:17 | 189 | < 0.1%
06:32:06 | 186 | < 0.1%
06:32:15 | 184 | < 0.1%
06:32:23 | 184 | < 0.1%
06:36:02 | 183 | < 0.1%
06:32:12 | 183 | < 0.1%
06:36:15 | 183 | < 0.1%
06:33:50 | 182 | < 0.1%
Other values (86390) | 5475145 | > 99.9%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Value | Count | Frequency (%)
: | 10954012 | 25.0%
1 | 6413260 | 14.6%
0 | 5459482 | 12.5%
2 | 4038374 | 9.2%
3 | 3554841 | 8.1%
4 | 3443261 | 7.9%
5 | 3386306 | 7.7%
6 | 1758466 | 4.0%
8 | 1614860 | 3.7%
7 | 1610957 | 3.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 32862036 | 75.0%
Other Punctuation | 10954012 | 25.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 6413260 | 19.5%
0 | 5459482 | 16.6%
2 | 4038374 | 12.3%
3 | 3554841 | 10.8%
4 | 3443261 | 10.5%
5 | 3386306 | 10.3%
6 | 1758466 | 5.4%
8 | 1614860 | 4.9%
7 | 1610957 | 4.9%
9 | 1582229 | 4.8%

Other Punctuation
Value | Count | Frequency (%)
: | 10954012 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 43816048 | 100.0%


Most occurring blocks

Value | Count | Frequency (%)
ASCII | 43816048 | 100.0%


geo_lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 448318
Distinct (%): 8.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.03826356
Minimum: 41.4590611
Maximum: 71.9803994
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.8 MiB

Quantile statistics

Minimum: 41.4590611
5th percentile: 44.8950433
Q1: 53.3776756
Median: 55.171385
Q3: 56.2261305
95th percentile: 59.96686658
Maximum: 71.9803994
Range: 30.5213383
Interquartile range (IQR): 2.8484549

Descriptive statistics

Standard deviation: 4.622757917
Coefficient of variation (CV): 0.08554601153
Kurtosis: 0.1878352641
Mean: 54.03826356
Median Absolute Deviation (MAD): 1.1407398
Skewness: -0.981604419
Sum: 295967893.7
Variance: 21.36989076
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]

Value | Count | Frequency (%)
55.0303931 | 106177 | 1.9%
55.014108 | 29577 | 0.5%
55.0128684 | 26030 | 0.5%
54.9471407 | 21055 | 0.4%
59.939084 | 20873 | 0.4%
55.017756 | 18885 | 0.3%
55.0176723 | 17999 | 0.3%
55.0139939 | 16518 | 0.3%
55.0128402 | 15983 | 0.3%
55.7943584 | 13626 | 0.2%
Other values (448308) | 5190283 | 94.8%
ValueCountFrequency (%)
41.45906112
 
< 0.1%
41.45908940
< 0.1%
41.61687771
 
< 0.1%
41.617528139
< 0.1%
41.62013342
 
< 0.1%
41.67409721
 
< 0.1%
41.67481711
 
< 0.1%
41.6773462
 
< 0.1%
41.6774111
 
< 0.1%
41.69224393
 
< 0.1%
ValueCountFrequency (%)
71.98039941
 
< 0.1%
71.63896991
 
< 0.1%
71.63625121
 
< 0.1%
71.6342551
 
< 0.1%
70.62060331
 
< 0.1%
69.63673711
 
< 0.1%
69.49940272
 
< 0.1%
69.49870913
< 0.1%
69.49849291
 
< 0.1%
69.49840495
< 0.1%

geo_lon
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 449701
Distinct (%): 8.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.24433248
Minimum: 19.890196
Maximum: 162.5360775
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.8 MiB

Quantile statistics

Minimum: 19.890196
5th percentile: 30.318554
Q1: 37.777895
Median: 43.067741
Q3: 65.6489496
95th percentile: 83.6773749
Maximum: 162.5360775
Range: 142.6458815
Interquartile range (IQR): 27.8710546

Descriptive statistics

Standard deviation: 20.74762836
Coefficient of variation (CV): 0.3896682969
Kurtosis: -0.3778285941
Mean: 53.24433248
Median Absolute Deviation (MAD): 8.7025264
Skewness: 0.8397470317
Sum: 291619528.4
Variance: 430.4640825
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]

Value | Count | Frequency (%)
83.0155452 | 106177 | 1.9%
83.0016615 | 29577 | 0.5%
82.9999987 | 26030 | 0.5%
82.9585961 | 21055 | 0.4%
30.315879 | 20873 | 0.4%
83.003578 | 18886 | 0.3%
83.003522 | 17999 | 0.3%
83.0033194 | 16518 | 0.3%
83.0018862 | 15983 | 0.3%
49.1114975 | 13626 | 0.2%
Other values (449691) | 5190282 | 94.8%
ValueCountFrequency (%)
19.8901962
< 0.1%
19.90309464
< 0.1%
19.90393111
 
< 0.1%
19.90463311
 
< 0.1%
19.905441
 
< 0.1%
19.9063821
 
< 0.1%
19.90646551
 
< 0.1%
19.90675181
 
< 0.1%
19.90826482
< 0.1%
19.9130431
 
< 0.1%
ValueCountFrequency (%)
162.53607751
< 0.1%
161.33299361
< 0.1%
161.32859731
< 0.1%
161.32784361
< 0.1%
161.32443571
< 0.1%
159.83671871
< 0.1%
158.71332861
< 0.1%
158.71265562
< 0.1%
158.71030411
< 0.1%
158.69903521
< 0.1%

region
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 84
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4307.140936
Minimum: 3
Maximum: 61888
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.8 MiB

Quantile statistics

Minimum: 3
5th percentile: 3
Q1: 2661
Median: 2922
Q3: 6171
95th percentile: 9654
Maximum: 61888
Range: 61885
Interquartile range (IQR): 3510

Descriptive statistics

Standard deviation: 3308.050175
Coefficient of variation (CV): 0.7680385258
Kurtosis: -0.7214476444
Mean: 4307.140936
Median Absolute Deviation (MAD): 2360
Skewness: 0.5898172921
Sum: 2.359023675 × 10^10
Variance: 10943195.96
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]

Value | Count | Frequency (%)
9654 | 1049435 | 19.2%
2843 | 637224 | 11.6%
81 | 500368 | 9.1%
2661 | 461820 | 8.4%
3 | 439511 | 8.0%
6171 | 237289 | 4.3%
2922 | 230545 | 4.2%
3230 | 222652 | 4.1%
5282 | 155645 | 2.8%
3991 | 141633 | 2.6%
Other values (74) | 1400884 | 25.6%
ValueCountFrequency (%)
3439511
8.0%
6977
 
< 0.1%
81500368
9.1%
8212519
 
< 0.1%
101048396
 
0.9%
14913857
 
0.1%
190112
 
< 0.1%
207263128
 
1.2%
23288160
 
0.1%
235922216
 
0.4%
ValueCountFrequency (%)
618885
 
< 0.1%
16705139
 
< 0.1%
14880357
 
< 0.1%
14368593
 
< 0.1%
139199913
0.2%
13913735
 
< 0.1%
13098256
 
< 0.1%
119916382
0.1%
114165243
0.1%
1117111654
0.2%

building_type
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.948966278
Minimum: 0
Maximum: 5
Zeros: 307165
Zeros (%): 5.6%
Negative: 0
Negative (%): 0.0%
Memory size: 41.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 2
Q3: 3
95th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.038536638
Coefficient of variation (CV): 0.5328653703
Kurtosis: -0.9864830136
Mean: 1.948966278
Median Absolute Deviation (MAD): 1
Skewness: 0.03600047645
Sum: 10674500
Variance: 1.078558348
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=6)]

Value | Count | Frequency (%)
1 | 1955661 | 35.7%
3 | 1892756 | 34.6%
2 | 1130731 | 20.6%
0 | 307165 | 5.6%
4 | 174356 | 3.2%
5 | 16337 | 0.3%

Value | Count | Frequency (%)
0 | 307165 | 5.6%
1 | 1955661 | 35.7%
2 | 1130731 | 20.6%
3 | 1892756 | 34.6%
4 | 174356 | 3.2%
5 | 16337 | 0.3%

Value | Count | Frequency (%)
5 | 16337 | 0.3%
4 | 174356 | 3.2%
3 | 1892756 | 34.6%
2 | 1130731 | 20.6%
1 | 1955661 | 35.7%
0 | 307165 | 5.6%

level
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 39
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.214529982
Minimum: 1
Maximum: 39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.8 MiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 5
Q3: 9
95th percentile: 16
Maximum: 39
Range: 38
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.957419277
Coefficient of variation (CV): 0.7977142746
Kurtosis: 2.208892357
Mean: 6.214529982
Median Absolute Deviation (MAD): 3
Skewness: 1.42927485
Sum: 34037018
Variance: 24.57600589
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=39)]

Value | Count | Frequency (%)
2 | 707923 | 12.9%
1 | 677583 | 12.4%
3 | 610523 | 11.1%
5 | 572040 | 10.4%
4 | 565548 | 10.3%
6 | 331771 | 6.1%
8 | 306960 | 5.6%
7 | 301103 | 5.5%
9 | 297467 | 5.4%
10 | 232850 | 4.3%
Other values (29) | 873238 | 15.9%

Value | Count | Frequency (%)
1 | 677583 | 12.4%
2 | 707923 | 12.9%
3 | 610523 | 11.1%
4 | 565548 | 10.3%
5 | 572040 | 10.4%
6 | 331771 | 6.1%
7 | 301103 | 5.5%
8 | 306960 | 5.6%
9 | 297467 | 5.4%
10 | 232850 | 4.3%

Value | Count | Frequency (%)
39 | 28 | < 0.1%
38 | 40 | < 0.1%
37 | 105 | < 0.1%
36 | 151 | < 0.1%
35 | 136 | < 0.1%
34 | 262 | < 0.1%
33 | 772 | < 0.1%
32 | 1102 | < 0.1%
31 | 1162 | < 0.1%
30 | 1417 | < 0.1%

levels
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 39
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.39892032
Minimum: 1
Maximum: 39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.8 MiB

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 5
Median: 10
Q3: 16
95th percentile: 25
Maximum: 39
Range: 38
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.535733868
Coefficient of variation (CV): 0.573364291
Kurtosis: 0.148753319
Mean: 11.39892032
Median Absolute Deviation (MAD): 5
Skewness: 0.8296565373
Sum: 62431955
Variance: 42.7158172
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=39)]

Value | Count | Frequency (%)
5 | 1002943 | 18.3%
10 | 941125 | 17.2%
9 | 734812 | 13.4%
17 | 351218 | 6.4%
16 | 308093 | 5.6%
25 | 215201 | 3.9%
12 | 166984 | 3.0%
3 | 164814 | 3.0%
4 | 162246 | 3.0%
18 | 152007 | 2.8%
Other values (29) | 1277563 | 23.3%

Value | Count | Frequency (%)
1 | 23722 | 0.4%
2 | 117822 | 2.2%
3 | 164814 | 3.0%
4 | 162246 | 3.0%
5 | 1002943 | 18.3%
6 | 109277 | 2.0%
7 | 63092 | 1.2%
8 | 76137 | 1.4%
9 | 734812 | 13.4%
10 | 941125 | 17.2%

Value | Count | Frequency (%)
39 | 1970 | < 0.1%
38 | 520 | < 0.1%
37 | 1904 | < 0.1%
36 | 1319 | < 0.1%
35 | 1961 | < 0.1%
34 | 916 | < 0.1%
33 | 17158 | 0.3%
32 | 8272 | 0.2%
31 | 4538 | 0.1%
30 | 5859 | 0.1%

rooms
Real number (ℝ)

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.726173387
Minimum: -2
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 306552
Negative (%): 5.6%
Memory size: 41.8 MiB

Quantile statistics

Minimum: -2
5th percentile: -1
Q1: 1
Median: 2
Q3: 2
95th percentile: 3
Maximum: 10
Range: 12
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.082132738
Coefficient of variation (CV): 0.6268968956
Kurtosis: 0.9955762546
Mean: 1.726173387
Median Absolute Deviation (MAD): 1
Skewness: -0.2318678244
Sum: 9454262
Variance: 1.171011262
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=12)]

Value | Count | Frequency (%)
1 | 2067013 | 37.7%
2 | 1827514 | 33.4%
3 | 1097354 | 20.0%
-1 | 306209 | 5.6%
4 | 152160 | 2.8%
5 | 22576 | 0.4%
6 | 2357 | < 0.1%
7 | 788 | < 0.1%
8 | 353 | < 0.1%
-2 | 343 | < 0.1%
Other values (2) | 339 | < 0.1%

Value | Count | Frequency (%)
-2 | 343 | < 0.1%
-1 | 306209 | 5.6%
1 | 2067013 | 37.7%
2 | 1827514 | 33.4%
3 | 1097354 | 20.0%
4 | 152160 | 2.8%
5 | 22576 | 0.4%
6 | 2357 | < 0.1%
7 | 788 | < 0.1%
8 | 353 | < 0.1%

Value | Count | Frequency (%)
10 | 1 | < 0.1%
9 | 338 | < 0.1%
8 | 353 | < 0.1%
7 | 788 | < 0.1%
6 | 2357 | < 0.1%
5 | 22576 | 0.4%
4 | 152160 | 2.8%
3 | 1097354 | 20.0%
2 | 1827514 | 33.4%
1 | 2067013 | 37.7%

area
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 12741
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.91824855
Minimum: 0.07
Maximum: 7856
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.8 MiB

Quantile statistics

Minimum: 0.07
5th percentile: 29
Q1: 38
Median: 48.02
Q3: 63.13
95th percentile: 93
Maximum: 7856
Range: 7855.93
Interquartile range (IQR): 25.13

Descriptive statistics

Standard deviation: 33.352926
Coefficient of variation (CV): 0.6185832606
Kurtosis: 8492.648526
Mean: 53.91824855
Median Absolute Deviation (MAD): 11.98
Skewness: 57.05613875
Sum: 295310570.8
Variance: 1112.417673
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]

Value | Count | Frequency (%)
40 | 96602 | 1.8%
45 | 95018 | 1.7%
42 | 94570 | 1.7%
44 | 94073 | 1.7%
43 | 91727 | 1.7%
60 | 88074 | 1.6%
30 | 77211 | 1.4%
38 | 75534 | 1.4%
33 | 73824 | 1.3%
32 | 72387 | 1.3%
Other values (12731) | 4617986 | 84.3%

Value | Count | Frequency (%)
0.07 | 1 | < 0.1%
0.22 | 1 | < 0.1%
0.28 | 1 | < 0.1%
0.32 | 1 | < 0.1%
0.45 | 1 | < 0.1%
0.48 | 2 | < 0.1%
0.52 | 1 | < 0.1%
0.53 | 2 | < 0.1%
0.61 | 1 | < 0.1%
0.64 | 1 | < 0.1%

Value | Count | Frequency (%)
7856 | 1 | < 0.1%
7660 | 1 | < 0.1%
7625 | 1 | < 0.1%
7513.4 | 1 | < 0.1%
7190 | 1 | < 0.1%
6812.6 | 1 | < 0.1%
6580 | 1 | < 0.1%
5985 | 1 | < 0.1%
5711.6 | 1 | < 0.1%
5644 | 1 | < 0.1%

kitchen_area
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 4154
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.62839745
Minimum: 0.01
Maximum: 9999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 41.8 MiB

Quantile statistics

Minimum: 0.01
5th percentile: 5
Q1: 7
Median: 9.7
Q3: 12.7
95th percentile: 20
Maximum: 9999
Range: 9998.99
Interquartile range (IQR): 5.7

Descriptive statistics

Standard deviation: 9.792379875
Coefficient of variation (CV): 0.9213411448
Kurtosis: 371626.0466
Mean: 10.62839745
Median Absolute Deviation (MAD): 2.7
Skewness: 452.5307552
Sum: 58211796.61
Variance: 95.89070361
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]

Value | Count | Frequency (%)
6 | 520623 | 9.5%
9 | 449543 | 8.2%
10 | 369245 | 6.7%
8 | 310615 | 5.7%
12 | 288100 | 5.3%
7 | 259294 | 4.7%
5 | 256190 | 4.7%
11 | 209125 | 3.8%
14 | 158354 | 2.9%
13 | 124148 | 2.3%
Other values (4144) | 2531769 | 46.2%
ValueCountFrequency (%)
0.0112
 
< 0.1%
0.022
 
< 0.1%
0.033
 
< 0.1%
0.043
 
< 0.1%
0.0527
 
< 0.1%
0.06168
< 0.1%
0.0771
 
< 0.1%
0.0858
 
< 0.1%
0.09209
< 0.1%
0.1111
< 0.1%
ValueCountFrequency (%)
99991
< 0.1%
82351
< 0.1%
65001
< 0.1%
62701
< 0.1%
49491
< 0.1%
30002
< 0.1%
25001
< 0.1%
21101
< 0.1%
19581
< 0.1%
15001
< 0.1%

object_type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 304.5 MiB

1: 3863809
11: 1613197

Length

Max length: 2
Median length: 1
Mean length: 1.294539937
Min length: 1

Characters and Unicode

Total characters: 7090203
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 11
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 3863809 | 70.5%
11 | 1613197 | 29.5%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Most occurring characters

Value | Count | Frequency (%)
1 | 7090203 | 100.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 7090203 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 7090203 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 7090203 | 100.0%


Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7090203 | 100.0%


Interactions

[Figures: pairwise interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r (only the sign of the slope does).

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
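
As a rough illustration of the calculation just described, the following minimal sketch computes r in plain Python. The function name `pearson_r` and the toy values are assumptions made for illustration; they are not taken from the dataset or from the tool that produced this report.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Toy values (hypothetical, not rows from the dataset).
toy_area = [30.0, 45.0, 60.0, 82.0]
toy_price = [3.0, 4.4, 6.1, 8.0]

r = pearson_r(toy_area, toy_price)
# Shifting and rescaling one variable (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r([2 * a + 10 for a in toy_area], toy_price)
```

Comparing `r` and `r_rescaled` shows the location and scale invariance mentioned above: both values agree up to floating-point rounding.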

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
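
The rank-based calculation can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are hypothetical, and ties are given the mean of their positions, which is a common convention but not something the report itself specifies.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Toy values (hypothetical): y grows monotonically but not linearly with x.
toy_x = [1, 2, 3, 4]
toy_y = [1, 8, 27, 64]
rho = spearman_rho(toy_x, toy_y)
r = pearson_r(toy_x, toy_y)
```

On these toy values the ranks of both variables coincide, so `rho` reaches its maximum while `r` stays below it, which is the difference between monotonic and linear correlation described above.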

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
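
A minimal sketch of this pair-counting definition is shown below. The function name `kendall_tau` and the toy values are assumptions for illustration. The sketch divides by the total number of pairs exactly as described in the text; library implementations often use a tie-adjusted variant instead, so results may differ when the data contain ties. The loop over all pairs is quadratic in the number of observations, so it is only practical on small samples.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    points = list(zip(xs, ys))
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
        # direction == 0 is a tie and counts as neither
    total_pairs = len(points) * (len(points) - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy values (hypothetical): one swapped pair out of six.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
```

With four observations there are six pairs; five are concordant and one is discordant, so `tau` equals (5 - 1) / 6.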

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures nonlinear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     price        date      time    geo_lat    geo_lon  region  building_type  level  levels  rooms   area  kitchen_area  object_type
0  6050000  2018-02-19  20:00:21  59.805808  30.376141    2661              1      8      10      3   82.6          10.8            1
1  8650000  2018-02-27  12:04:54  55.683807  37.297405      81              3      5      24      2   69.1          12.0            1
2  4000000  2018-02-28  15:44:00  56.295250  44.061637    2871              1      5       9      3   66.0          10.0            1
3  1850000  2018-03-01  11:24:52  44.996132  39.074783    2843              4     12      16      2   38.0           5.0           11
4  5450000  2018-03-01  17:42:43  55.918767  37.984642      81              3     13      14      2   60.0          10.0            1
5  3300000  2018-03-02  21:18:42  55.908253  37.726448      81              1      4       5      1   32.0           6.0            1
6  4704280  2018-03-04  12:35:25  55.621097  37.431002       3              2      1      25      1   31.7           6.0           11
7  3600000  2018-03-04  20:52:38  59.875526  30.395457    2661              1      2       5      1   31.1           6.0            1
8  3390000  2018-03-05  07:07:05  53.195031  50.106952    3106              2      4      24      2   64.0          13.0           11
9  2800000  2018-03-06  09:57:10  55.736972  38.846457      81              1      9      10      2   55.0           8.0            1

Last rows

            price        date      time    geo_lat    geo_lon  region  building_type  level  levels  rooms   area  kitchen_area  object_type
5476996   6400000  2021-05-01  20:13:41  55.904292  37.984368      81              3      4      17      3   82.0          10.6            1
5476997   7200000  2021-05-01  20:13:42  59.772947  30.056530    3446              2      2       3      2   59.0          22.3           11
5476998   4900000  2021-05-01  20:13:43  59.850103  30.357299    2661              1      2       5      1   31.0           6.0            1
5476999  12850000  2021-05-01  20:13:47  55.701280  37.642654       3              2     12      24      1   41.0           9.0            1
5477000   9000000  2021-05-01  20:13:48  44.051357  42.867573    2900              3      4       5      4  178.0          20.0            1
5477001  19739760  2021-05-01  20:13:58  55.804736  37.750898       3              1      8      17      4   93.2          13.8           11
5477002  12503160  2021-05-01  20:14:01  55.841415  37.489624       3              2     17      32      2   45.9           6.6           11
5477003   8800000  2021-05-01  20:14:04  56.283909  44.075408    2871              2      4      17      3   86.5          11.8            1
5477004  11831910  2021-05-01  20:14:12  55.804736  37.750898       3              1      8      33      2   52.1          18.9           11
5477005  13316200  2021-05-01  20:14:15  55.860240  37.540356       3              2     10      23      2   55.6          20.8           11

Duplicate rows

Most frequently occurring

         price        date      time    geo_lat    geo_lon  region  building_type  level  levels  rooms   area  kitchen_area  object_type  # duplicates
 822   4165000  2019-08-01  01:06:29  55.829360  37.814968       3              1      5      14      1   39.1           8.7            1             5
1284  15651445  2020-03-12  07:34:04  55.880945  37.541180       3              2     15      24      3   88.3          13.2           11             5
1332  17957527  2020-03-12  11:20:50  55.881585  37.683930       3              2     13      24      2   89.9          19.8           11             5
1367  19272327  2020-03-12  10:08:30  55.881585  37.683930       3              2     17      24      3   94.7          16.0           11             5
1370  19315039  2020-03-12  07:54:36  55.881585  37.683930       3              2     16      24      3   95.4          17.9           11             5
1372  19470003  2020-03-12  07:54:36  55.881585  37.683930       3              2     17      24      3   95.4          17.9           11             5
1386  20504389  2020-03-12  11:01:07  55.881585  37.683930       3              2     18      24      3  103.9          12.4           11             5
 307   2173600  2019-09-11  16:58:10  60.057175  30.266266    2661              2     11      25      1   20.9           5.0           11             4
 951   4949880  2020-01-22  14:43:44  55.770856  37.517985       3              2     13      22      1   24.7           3.0           11             4
1107   7935809  2020-03-12  07:54:37  55.881585  37.683930       3              2      6      24      1   37.5          13.1           11             4
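
A listing of the most frequently occurring duplicate rows can be reproduced in outline by counting identical records. The sketch below is a minimal illustration in plain Python, not the method used to generate this report: the function name `most_frequent_duplicates` and the toy rows are hypothetical, and it assumes that the duplicate count means the number of times an identical row occurs.

```python
from collections import Counter

def most_frequent_duplicates(rows, top=10):
    """Return (row, occurrences) for rows seen more than once, most frequent first."""
    # Rows are converted to tuples so they can be used as dictionary keys.
    counts = Counter(tuple(row) for row in rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    # Stable sort: rows with equal counts keep their first-seen order.
    repeated.sort(key=lambda item: item[1], reverse=True)
    return repeated[:top]

# Toy rows (hypothetical, not taken from the dataset).
toy_rows = [
    (100, "a"), (200, "b"), (100, "a"),
    (300, "c"), (200, "b"), (100, "a"),
]
duplicates = most_frequent_duplicates(toy_rows)
```

Here `(100, "a")` occurs three times and `(200, "b")` twice, while `(300, "c")` is unique and is therefore left out of the result.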